ABSTRACT

Sponsoring military activity: Office of Naval Research, Washington, D.C.

This paper has been prepared for delivery as an opening address at a Symposium at Chapel Hill, N.C., intended to honor Professor Wassily Hoeffding. The contents of the paper are summarized in the titles of its five chapters marked with Roman numerals II to VI, as follows: II. The Cramér-Hoeffding research incident (= the importance of the theory of large deviations initiated by Cramér for the asymptotic theory of statistical tests). III. Two different strategies in mathematical statistics. IV. The Yule-Pólya research incident: (i) mechanism of a natural phenomenon, and (ii) non-identifiability. V. Some modern recurrences of the Yule-Pólya problem. VI. Effort at an "optimal" competitor to K.P.'s chi square test. Possibly, the most important unresolved problem is that of the "residual non-identifiability" of the serial sacrifice experimental design, discussed in Chapter V.

I. INTRODUCTION

1. Congratulations to Professor Hoeffding. I am very grateful to Professor Chakravarti for his invitation to open the discussion at this Symposium intended to honor Professor Wassily Hoeffding. We met long ago and, from the very beginning, it was a pleasure to find a marked similarity in our research interests. After the joint work with Egon S. Pearson [1933] concerned with power functions and later, after the development of the theory of confidence intervals [1937a], my research efforts focused on the deduction of variously defined "optimal" statistical methodologies [1959] that could be easily used in studies of natural phenomena. Against this, here is the title of the paper Professor Hoeffding delivered at the Second Berkeley Symposium on Statistics and Probability, held during the summer of 1950, more than a quarter of a century ago: "Optimal nonparametric tests" [1951]. Since that time our intellectual contacts continued, but our personal encounters were "like Victoria Regina: seldom, seldom in bloom."

Incidentally, the problem of the optimal non-parametric tests of composite statistical hypotheses is still "on the books."

II. THE CRAMÉR-HOEFFDING RESEARCH INCIDENT


2. The Harald Cramér Ground Breaking Paper of 1938. The mathematical tool most frequently used in the development of statistical methods is the Central Limit Theorem on probabilities, roughly as follows. Let $\{X_n\}$ be a sequence of random variables each having two moments, $E X_n = 0$ and $E X_n^2 = \sigma^2 < \infty$, and let

$$S_n = \sum_{j=1}^{n} X_j. \tag{1}$$

Then, under certain conditions,

$$\lim_{n \to \infty} P\{S_n \le t \sigma \sqrt{n}\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-u^2/2}\, du = \Phi(t) \tag{2}$$

for any preassigned real number $t$. This theorem has preoccupied mathematicians for a couple of centuries now [Loève, 1980]. The successive proofs given differ in the generality of the "certain conditions" just mentioned. This long duration of efforts to prove the validity of formula (2) resulted in the establishment of a "routine of thought." Whenever some particular problems of mathematical statistics involved the consideration of sums of random variables like (1), with the value of $n$ considered "large," it became customary to presume that formula (2) gives a satisfactory approximation of the true distribution of $S_n$. The word "customary" is not adequate. The breaking of a "routine of thought" stimulates opposition.


Among other things, the classical central limit theorem was used to compare the effectiveness of statistical tests. Here, the term Pitman asymptotic efficiency comes to my mind.

As described by Yu. V. Linnik [1961], the honor of breaking this firmly established routine of thought belongs to Harald Cramér. In 1938, just before the beginning of World War II, there appeared Cramér's paper [1938] offering the first solution to a novel question that Cramér dared to ask. Briefly, it is as follows.

With reference to formula (1) assume that all the variables of the sequence $\{X_n\}$ are mutually independent and identically distributed. Consider the probability

$$F_n(t_n) = P\{S_n \le t_n \sigma \sqrt{n}\}, \tag{3}$$

where $t_n$ grows to infinity as $n$ is increased. Cramér's ground breaking question was about the asymptotic behavior of the ratio

$$\frac{1 - F_n(t_n)}{1 - \Phi(t_n)} \tag{4}$$

depending on properties of the variables $X_n$ and on the rate of increase of $t_n$. This paper generated a new chapter of probability theory, labeled "theory of large deviations" [Linnik, 1961]. Briefly and roughly, the important question was whether $1 - \Phi(t_n)$ can be considered as a satisfactory approximation of the probability that the sum $S_n$ will exceed a limit proportional to $t_n \sqrt{n}$.
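
The question of how well the normal tail approximates the true tail probability can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not a computation from Cramér's paper: it assumes, hypothetically, that each variable is a centered unit exponential, so that the variance is 1 and the exact tail of the sum is available in closed form. The function names are invented for this example.

```python
import math


def normal_upper_tail(t):
    """Return 1 - Phi(t) for the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def exact_upper_tail(n, t):
    """Return P{S_n > t * sqrt(n)} for X_j = E_j - 1, E_j ~ Exponential(1).

    Assumption of this sketch: E_1 + ... + E_n is Gamma(n, 1), and for
    integer n the Gamma upper tail equals a Poisson lower sum:
    P{Gamma(n, 1) > x} = sum_{j=0}^{n-1} exp(-x) * x**j / j!.
    """
    x = n + t * math.sqrt(n)
    return sum(
        math.exp(-x + j * math.log(x) - math.lgamma(j + 1)) for j in range(n)
    )


def ratio(n, t):
    """Exact tail probability divided by its normal approximation."""
    return exact_upper_tail(n, t) / normal_upper_tail(t)


if __name__ == "__main__":
    n = 100
    for t in (0.5, 1.0, 2.0, 3.0, 4.0):
        print(f"n={n}  t={t:3.1f}  ratio={ratio(n, t):.3f}")
```

For a fixed number of observations the printed ratio stays near unity for moderate t and drifts away from it as t grows, which is the behavior the theory of large deviations is designed to describe.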
3. Professor Hoeffding's Initiative to Use the Novel Probabilistic Tool. While it is obvious that Cramér's limit theorem on large deviations must be a better tool for studying the asymptotic properties of statistical tests than is the classical central limit theorem, the disasters and the length of World War II were not conducive to the development of conceptual mathematical subdisciplines. In consequence, the relevance of Cramér's ground breaking work remained unnoticed for almost two decades. Here, a paper by Professor Hoeffding [1965] played a special role.

The title of this paper is: "Asymptotically optimal tests for multinomial distributions." Professor Hoeffding begins by formulating his own definition of asymptotic optimality and then states: "To attack these problems, the theory of probabilities of large deviations is needed." This is followed by proofs that, under specified conditions, certain familiar tests (the likelihood ratio and the chi square tests) are asymptotically optimal in the sense of the new, call it, Hoeffding definition of optimality.

Professor Hoeffding's paper was presented at a meeting of the IMS and the discussion that followed is recorded in the Annals. It appeared that, even though Cramér's theorem on large deviations was familiar to several statisticians, including H. Chernoff, R.A. Wijsman and D.G. Chapman, Professor Hoeffding must be credited with the first serious effort to see what the novel probabilistic tool can contribute to the theory of asymptotic tests.

Incidentally, published in 1965, fourteen years ago, Hoeffding's paper continues to affect the thinking of this day. The following quote is from a paper published in the last issue of the Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete [Berk and Jones, 1979]: "The first [lemma] is actually a special case of a theorem of Hoeffding (1965), Theorem 2.1."

My  hearty  compliments  to  Professor  Hoeffding! 

4. Reasons for Preferring the Theory of Large Deviations as a Tool for Studying Asymptotic Tests. The word "preferring" in the title of the present section emphasizes its subjective character. It has to do with the meaning I attach to the terms "errors of the first and second kinds" possible to commit in testing a statistical hypothesis.

As described in [1977a], in the course of an empirical study one is frequently faced with a two-decision problem. Depending upon the outcome of the statistical test used, one has to decide to go, say, either "right" or "left," and either decision can be erroneous. Depending upon personal attitudes, one of the two errors will be judged more important to avoid than the other. My definition is: the error that is more important to avoid is called the error of the "first kind." In consequence, when selecting a test to be used in a particular empirical study, my first concern is to make sure that the probability of committing an error of the first kind does not exceed a preassigned level $\alpha$, now called "level of significance." Depending upon the subjective feeling of importance, the chosen level of significance may be $\alpha = 0.10$, or $\alpha = 0.05$, or $\alpha = 0.01$, etc.

When the problem of the desired level of significance is solved and if it can be ensured by any test of some determined class, the time comes to think of the less important error, the error of the "second kind," which means to determine the most powerful test within the class considered.

This is the background of my preference for the theory of large deviations as a tool in the theory of asymptotic tests as compared with the classical central limit theorem.

In an empirical study involving a two-decision problem, one is faced with some real life situation, with some hypothesis which can be true or false and with the degree of its falsehood measured by a parameter $\xi$ the value of which is unknown. The only thing that is under our control, at least to some extent, is the number $n$ of observations that can be used to test the hypothesis that $\xi = 0$. The all important question is whether this particular number $n$ is large enough to achieve the chosen level of significance $\alpha$. The answer depends on how close the ratio (4) is to unity, which is the subject of Cramér's theory of large deviations, including its modern descendants. The use of this theory does not violate the real life situation of the problem, with $\xi$ having some unknown fixed value.

Now consider the asymptotic test possibilities offered by the classical central limit theorem on probabilities. As is well known, both the Pitman asymptotic efficiency theory and the theories of asymptotic tests developed by Cramér [19?8] and by myself [1959] depend on visualizing that the real life problem, say, the problem of testing considered today, is a member of a hypothetical sequence with the fixed unknown $\xi$ replaced by $\xi_n$, such that the product $\xi_n \sqrt{n}$ is bounded away from zero and infinity, preferably tending to some known limit. This is something very different from and much less inspiring than the question of how close to unity is the value of (4).

III. TWO DIFFERENT STRATEGIES IN MATHEMATICAL STATISTICS

5. A Curious Detail of the History of Statistical Tests. The Cramér-Hoeffding research incident described in sections 2 and 3 illustrates a curious detail of the history of statistical tests, particularly of the early history. The customary strategy is composed of two consecutive steps. (i) A statistician concerned with some empirical domain proposes a testing procedure suggested by his intuition. Then, (ii) an effort is made to investigate the properties of this procedure, occasionally leading to the conclusion that it is in some sense "optimal." Examples of this sequence (i)-(ii) are countless.


The first test procedure, still in very frequent use, is the chi square test introduced by Karl Pearson in 1900. It was one of the subjects studied in the Hoeffding paper just discussed. The other test discussed in the same Hoeffding paper is the likelihood ratio test. As stated by Professor Hoeffding, the likelihood ratio criterion was suggested by E. S. Pearson and myself in 1928. However, this suggestion was made on intuitive grounds. The criterion suggested did not result from a search for a procedure satisfying a defined concept of optimality. The intuitive background of the likelihood ratio test was simply as follows: if among the contemplated admissible hypotheses there are some that ascribe to the facts observed probabilities much larger than that ascribed by the hypothesis tested, then it appears "reasonable" to reject that hypothesis.

As another example, I wish to mention a test criterion competitive to the chi square, first suggested by Harald Cramér [1928] and somewhat later also advanced by Richard von Mises [1931].

The alternative philosophy, or strategy, is just the opposite to the sequence (i) and (ii). When one has to deal with an empirical domain of study and one feels in need of a statistical procedure, it seems natural to visualize the properties that this procedure should have to deserve the description "optimal." Naturally, such a concept of optimality can depend upon the domain of empirical study and it must depend on the subjective preferences of its author. However, once the optimality is defined, the mathematical problem occurs: to find the "optimal," if such exists. On occasion one finds that the initially defined optimal procedures do not exist. Too bad! Then one has to look for a "compromise optimality," etc. One example is the concept of "unbiased most powerful tests" [Neyman and Pearson, 1936]. Here, the word unbiased marks the compromise optimality. In the case considered, the "uniformly" most powerful test does not exist.

IV. THE YULE-PÓLYA RESEARCH INCIDENT: (i) MECHANISM OF A NATURAL PHENOMENON, AND (ii) NON-IDENTIFIABILITY

6. My Contacts with George Udny Yule. During my four year long activities at the Department of Statistics, University College, London (1934-1938), I had the privilege of meeting quite a few outstanding scholars. This included G. U. Yule for whom I developed great respect and warm feelings.

The studies of Yule that attracted my particular attention were performed jointly with Greenwood [1920]. Subsequently, a related paper was published by E. M. Newbold [1928].

My preferred way of describing these studies is as follows: They are concerned with the chance mechanism operating in real life, the mechanism that determines the distribution of an observable random variable X. If this mechanism is understood, it could be used to solve an important practical problem.

The particular random variable X of the Greenwood-Yule-Newbold studies was the number of accidents per unit of time, per bus driver in London. The important practical problem considered was the means to diminish the frequency of accidents involving the buses. Exactly similar problems are important in the present epoch, even though the actual domain of study can be very different. One example is the question: how can one diminish the frequency of deaths from cancer?

The problem of accidents was studied in our Stat. Lab. in the early 1950's. Here Professor Grace E. Bates played an important role [Bates and Neyman, 1952a, 1952b].

The first of these papers is dedicated to the memory of George Udny Yule and is preceded by a one page biographical sketch. It includes the following passage: "In 1931 Yule felt that he was too old to hold the position of Reader at Cambridge University and retired. At the same time he felt young enough to learn to fly. Accordingly, he went through the intricacies of training, got a pilot's license and bought a plane. Unfortunately, a heart attack cut short both the flying and, to a considerable degree, his scholarly work."

It happened that my personal contacts with Yule were very limited. They occurred during the period when he was recovering from his heart attack. However, these contacts affected my thinking. In particular, they contributed to the formulation of my paper of 1937[b].

The attempts to decrease the frequency of accidents taking into account the "human factors," mentioned in the title of Miss Newbold's report, are connected with the concept now called "accident proneness." There is little doubt that particular individuals do differ in their proneness to accidents of some specified categories. However, the details of this variability are not clear and here empirical studies are important. During our studies in the early 1950's our thinking was affected by two contrasting hypothetical mechanisms. One of them is the Greenwood-Yule-Newbold (GYN, for short) hypothetical mechanism, the properties of which can be summarized as the "mixture - no contagion - no time effect" mechanism. The other hypothetical mechanism, implied by studies of George Pólya [1930], was just the contrary: "identity of individuals, contagion and time effect."

To be more specific: the GYN mechanism presupposed that the number of accidents incurred by a particular individual per unit of time, such as a year, is a Poisson variable with a fixed expectation $\lambda$, representing this individual's personal accident proneness, which remains unchanged throughout his active life (= "no time effect"). Another basic assumption is that the value of $\lambda$ varies from one individual to the next (= "mixture"). More particularly, the assumption was adopted that the variation of $\lambda$ within a relevant population, such as the population of actual or potential bus drivers in London, can be adequately represented by a gamma distribution.

Starting  with  these  basic  assumptions  it  was  easy  to 
deduce  that  the  number  of  accidents  per  year  incurred  by  in- 
dividual bus  drivers  must  have  a negative  binomial  distribution. 
Actually,  using  the  data  on  accidents  involving  bus  drivers 
it  was  found  that  this  distribution  could  be  well  fitted  by  a 
negative  binomial  so  that  the  GYN  mechanism  (or  shall  we  call 
it "model?") appeared to have been "confirmed."
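
The deduction that a gamma mixture of Poisson variables is negative binomial can be checked numerically. The following is a minimal sketch, not part of the original studies: the shape and rate values are hypothetical, the observation period is taken as one unit of time, and the mixture integral is approximated by a plain trapezoid rule (which assumes a shape parameter of at least 1 so that the gamma density is bounded at zero).

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events when the expectation is lam."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def gamma_pdf(lam, shape, rate):
    """Gamma density of the individual accident proneness lam."""
    if lam <= 0.0:
        return 0.0
    return math.exp(
        shape * math.log(rate)
        + (shape - 1.0) * math.log(lam)
        - rate * lam
        - math.lgamma(shape)
    )


def mixture_pmf(k, shape, rate, upper=60.0, steps=60000):
    """Trapezoid-rule integral of Poisson(k; lam) * Gamma(lam) over lam."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        lam = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * poisson_pmf(k, lam) * gamma_pdf(lam, shape, rate)
    return total * h


def negative_binomial_pmf(k, shape, rate):
    """Negative binomial pmf with p = rate / (rate + 1), one time unit."""
    p = rate / (rate + 1.0)
    log_coef = math.lgamma(shape + k) - math.lgamma(shape) - math.lgamma(k + 1)
    return math.exp(log_coef + shape * math.log(p) + k * math.log(1.0 - p))


if __name__ == "__main__":
    shape, rate = 2.0, 1.0  # hypothetical values chosen for illustration
    for k in range(6):
        print(k, mixture_pmf(k, shape, rate), negative_binomial_pmf(k, shape, rate))
```

The two printed columns should agree to within the accuracy of the numerical integration.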

Everything appeared nice and smooth until the Pólya "model" was examined. As described above, this model denied the existence of a "mixture." The basic assumption was that all individuals forming the population of actual or potential employees in a particular industry were "born equal." However, it was assumed that the number of accidents in a time interval [t, t+h), where h is a small positive number, depends upon the number of accidents incurred before time t (= "contagion"). Also, there was the assumption that, as the duration of employment increases, the experience gained may diminish the individual's accident proneness (= "time effect").

Using these specific assumptions suggested by the famous Pólya paper of 1930, it was easy to calculate the distribution of the number of accidents per year in a population comparable to that of the London bus drivers. Because of the contrast between the two hypothetical mechanisms, the GYN and the Pólya mechanisms, the expectation was that the two distributions would be very different. If this happened, then the empirical data, such as the data resulting from Miss Newbold's study of the London bus drivers, could be used to resolve questions like that in the title of our study [1952b]: "true or false contagion?"

When the easy calculations of the relevant probability generating function were performed, Dr. Bates and I experienced a little shock: with reference to a single observational period, such as a year, the Pólya "no mixture - contagion - time effect" model implied that the distribution of the number of accidents per driver must be a negative binomial, coinciding with that implied by the Greenwood-Yule-Newbold model! This finding brought to our minds several ideas that appear important to this day. One is the concept of non-identifiability. The other related idea is that the problem of validation of a hypothetical mechanism of a natural phenomenon deserves a serious effort. One hopeful possibility is that the non-identifiability of some two (or more) hypothetical mechanisms, the non-identifiability with respect to the distribution of a specific single random variable X, may disappear just as soon as one supplements X by some other appropriately selected variables, say $X_1, X_2, \dots, X_s$.
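
The coincidence of the two distributions for a single observational period can be reproduced numerically. The sketch below rests on an assumption not taken from the papers cited: the "contagion - time effect" mechanism is represented by a pure birth process in which every individual, having incurred k accidents by time t, has accident intensity (a + k) / (b + t). This is one standard parametrization of a Pólya-type process; the symbols a and b and the function names are hypothetical. The forward equations are integrated by a classical Runge-Kutta scheme and compared with the negative binomial law that a gamma mixture of Poisson variables (shape a, rate b) would give for the same period.

```python
import math


def contagion_pmf(a, b, t_end, k_max=40, steps=2000):
    """Distribution of the accident count at time t_end under contagion.

    Integrates p_k'(t) = -r_k(t) p_k(t) + r_{k-1}(t) p_{k-1}(t) with
    r_k(t) = (a + k) / (b + t), truncated at k_max, by fourth-order
    Runge-Kutta steps.
    """

    def deriv(p, t):
        d = [0.0] * (k_max + 1)
        for k in range(k_max + 1):
            flow = (a + k) / (b + t) * p[k]
            d[k] -= flow
            if k < k_max:
                d[k + 1] += flow
        return d

    p = [1.0] + [0.0] * k_max  # no accidents at time zero
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = deriv(p, t)
        k2 = deriv([x + 0.5 * h * y for x, y in zip(p, k1)], t + 0.5 * h)
        k3 = deriv([x + 0.5 * h * y for x, y in zip(p, k2)], t + 0.5 * h)
        k4 = deriv([x + h * y for x, y in zip(p, k3)], t + h)
        p = [
            x + h / 6.0 * (u + 2.0 * v + 2.0 * w + z)
            for x, u, v, w, z in zip(p, k1, k2, k3, k4)
        ]
        t += h
    return p


def mixture_negative_binomial_pmf(k, a, b, t_end):
    """Negative binomial pmf implied by a Gamma(a, rate b) mixture of Poissons."""
    prob = b / (b + t_end)
    log_coef = math.lgamma(a + k) - math.lgamma(a) - math.lgamma(k + 1)
    return math.exp(log_coef + a * math.log(prob) + k * math.log(1.0 - prob))


if __name__ == "__main__":
    a, b, t_end = 2.0, 1.0, 1.0  # hypothetical values for illustration
    p = contagion_pmf(a, b, t_end)
    for k in range(6):
        print(k, p[k], mixture_negative_binomial_pmf(k, a, b, t_end))
```

Under these assumptions the two columns agree, so counts from one period alone cannot separate the two mechanisms. Counts from two successive periods have different joint behavior under the two mechanisms, which is the route to identifiability mentioned in the text.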

The second of our joint papers considers a number of not too difficult empirical studies capable of providing a definitive answer to the all important question about the reality of "contagion" in accidents. E.g., the identifiability can be achieved by counting accidents of each driver not just in one particular year (say $X_1$ of them), but also those incurred during the following year, say $X_2$ of them, etc. See Grace E. Bates [1955].

This section concludes my description of the Yule-Pólya problem as it came to my attention with reference to industrial accidents: what is the governing chance mechanism? Without much risk of exaggeration one may assert that this type of problem is encountered in every serious study of a complex natural phenomenon. In cosmology: what is the chance mechanism governing the dispersal of clusters of galaxies? How can one verify any relevant hypothesis? In public health: what is the mechanism behind the observed geographic variability in the incidence of cancer? Through what experiments and with what statistical methodology can one gain reliable information? In weather modification experiments: what are the processes in the atmosphere that follow "cloud seeding?" What statistical methodology is likely to provide the desired information through the analysis of the many completed experiments?

Here, a remark on terminology seems in order. It seems to me that the common use of the term "model" deserves a modification or restriction. My preference would be to restrict the use of this term to sets of (customarily) qualitative assumptions advanced to explain a natural phenomenon. One example is the GYN model suggested to explain the notorious driver to driver variability in the number of accidents per year, the "mixture - no contagion - no time effect" model. The same applies to the Pólya "no mixture - contagion - time effect" model. This use of the term "model" appears quite different from the designation of a mathematical formula that fits the observations. One frequently encountered example is the phrase "linear model," etc.

Discussions of the Yule-Pólya dilemma relating to the problem of public health will be found in the next chapter.

V. SOME PRESENT DAY RECURRENCES OF THE YULE-PÓLYA DILEMMA

7. Public Health Policy and Basic Research. The importance and the difficulty of the present day public health problems overshadow those of industrial accidents symbolized by the names of Yule and Pólya. However, the broadly understood research problems remain similar.

One of the typical contemporary public health problems is concerned with the hazards from electricity producing plants [1977b], briefly as follows. A locality L, marked by a rapidly growing population, is in need of a new electricity producing plant. This may be either a nuclear facility or a fossil fuel burning unit and the choice is up to some decision making authorities. Among other things, the choice must be made taking into account some public health questions. Whatever type of plant is constructed, it will contribute to the local pollution in its own way. The important questions are: how many more cancer cases, heart attacks, etc. are to be expected in this locality L as a result of the predictable extra pollution from the normal operation of the novel electric generator? How can one answer this question reliably?

The reliability of the answer depends upon the understanding of two different mechanisms. One mechanism is concerned with the happenings in experimental animals, mice, dogs, etc., subjected to a specified change in the environmental pollution. The other important mechanism is that of the dependence of the effects of the first mechanism on the identity of the species concerned, whether mouse, or rat, or dog, or man. Obviously, the complexity of the problem is tremendous. It splits itself into a number of subproblems. In the next section, we shall consider one of these subproblems. It involves the ubiquitous phenomenon of non-identifiability.

8. Typical "Survival Experiment" and the Methodology of "Potential Survival Times." The customary source of information on the happenings in the experimental animals, say mice, exposed to some "agents" studied is a "survival experiment." There are two substantial groups of mice, one labeled "experimental" and the other "controls." The experimental mice are exposed to the agents studied and the controls are not. When a mouse of either group dies, its body is subjected to a pathological study and an effort is made to determine the cause of its death. With a degree of oversimplification, it is postulated that there is a somewhat limited number of possible causes of death, say K of them. The problem studied is that of the difference in death rates from the different causes among the experimental and the control mice. This is only a rough description of the problem. One of the difficulties that became obvious on closer examination is due to the omnipresent phenomenon of "competing risks." One illustrative example is as follows.

All of us alive today are exposed to a variety of risks of death, including street traffic and cancer. If I am run over and killed by a car tonight, it would be impossible for me to die later from cancer and, in due course, this would affect the published death rates from cancer. In consequence, the numerical results of a survival experiment with mice do not characterize "net rates" of deaths from the various causes of death studied but only the "crude rates." These crude rates corresponding to the different causes (or "risks") studied characterize not only the intensities of particular risks, but they also reflect the combined property of all of them that is due to competition. Now, let us visualize the results of a completed survival experiment after all the mice, say of the experimental group, have died.

Table 1 illustrates the obtainable results.

                              Table 1

        Illustration of the results of a survival experiment.

Cause of Death     Survival Times of Particular Mice

$C_1$              $t_{11} \le t_{12} \le t_{13} \le \dots \le t_{1n_1}$
$C_2$              $t_{21} \le t_{22} \le t_{23} \le \dots \le t_{2n_2}$
...                ...
$C_K$              $t_{K1} \le t_{K2} \le t_{K3} \le \dots \le t_{Kn_K}$

The first column of Table 1 enumerates all the K causes of death. The wide second column gives the corresponding consecutive survival times of mice that died from the particular causes. Thus, for example, the symbol $t_{11}$ stands for the time of the first recorded death from cause $C_1$. Similarly, the last symbol in the same line, namely $t_{1n_1}$, represents the time of death of the last mouse that died from the same cause $C_1$, etc. Here, then, the subscripts $n_1, n_2, \dots, n_K$ denote the numbers of mice that died from causes $C_1, C_2, \dots, C_K$, respectively. Naturally, these numbers $n_1, n_2, \dots, n_K$ will not be all equal and their variability will reflect both the severity of particular causes and their competition. The reader will have no difficulty in visualizing an exactly similar table compiled for the control mice. These two tables would then be ready for the evaluation of the effects of the agents studied on the survival experience of the mice.

Having in one's mind the problem of a new electric generator in locality L, one might think of the question: how many more deaths from cancer (perhaps cause $C_1$) should one expect among mice if the "agents" studied included irradiation? What about the methodology of evaluating the experiment that could answer reliably a question of this kind?

One of the methodologies used is that based on the concept of "potential survival times." For an experimental animal exposed to K possible risks (or causes) of death, the term i-th potential survival time designates a random variable $Y_i$ supposed to represent the age at death of this animal in the hypothetical condition in which $C_i$ is the only possible cause of death. The probability that $Y_i$ will exceed a preassigned value $t$ is called the "net survival probability."

Unfortunately, while a survival experiment can be conducted to
investigate a great variety of different "agents," the resulting
"causes" of death are not under control of the experimenter. Thus, no
direct empirical counterpart of the net survival probability can be
available. All that the results of a survival experiment illustrated
in Table 1 can provide is the empirical counterparts of the so-called
"crude survival probabilities." For the i-th cause the crude
probability of surviving up to time t, say $Q_i(t)$, is the probability
that $Y_i = \min(Y_1, Y_2, \ldots, Y_K)$ and that $Y_i > t$. Here, then,
the question arises whether a statistical methodology could be
developed to use the crude survival data as in Table 1, perhaps
somehow supplemented, in order to estimate the net survival
probabilities.
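The distinction between crude and net survival probabilities can be illustrated by a small simulation. The sketch below is not part of the original analysis: it assumes, purely for illustration, that the potential survival times are independent exponential variables with hypothetical rates, in which case both probabilities have simple closed forms. The function names are likewise hypothetical.

```python
import math
import random


def simulate_crude_and_net(rates, t, n_animals=50_000, seed=0):
    """Estimate crude and net survival probabilities at time t.

    Assumption (not from the text): Y_1..Y_K are independent
    exponential variables with the given rates.
    """
    rng = random.Random(seed)
    k = len(rates)
    crude = [0] * k  # counts of {Y_i is the minimum and Y_i > t}
    net = [0] * k    # counts of {Y_i > t}; visible only in a simulation
    for _ in range(n_animals):
        y = [rng.expovariate(r) for r in rates]
        first = min(range(k), key=lambda i: y[i])  # cause that actually kills
        if y[first] > t:
            crude[first] += 1
        for i in range(k):
            if y[i] > t:
                net[i] += 1
    return [c / n_animals for c in crude], [c / n_animals for c in net]


def exact_crude(rates, i, t):
    """Closed form of Q_i(t) under the independent exponential assumption."""
    total = sum(rates)
    return rates[i] / total * math.exp(-total * t)


def exact_net(rates, i, t):
    """Closed form of P(Y_i > t) under the same assumption."""
    return math.exp(-rates[i] * t)
```

In a real experiment only the first of the two returned lists has an empirical counterpart; the second is available here only because the simulation records all K potential times of every animal.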

As interestingly described by David [1974], the competing risk
phenomenon occurs not only in problems of public health but also in
problems of technological reliability. Here, the most attractive
presumption supplementing the data of a survival experiment is the
assumption that the potential survival times $Y_i$ are mutually
independent. However, the hypothesis of independence cannot be tested
using the data of a survival experiment, and the publications of
Tsiatis [1975] and of Peterson [1976] document the presence of
non-identifiability. The crude survival probabilities are consistent
with an infinity of systems of widely different net survival
probabilities. The conclusion is that the survival experiment of the
type described is too simplistic to provide all the valuable
information for studies of problems of health.
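A concrete pair of models may make the non-identifiability tangible. The sketch below is an illustration under stated assumptions, not a construction taken from the cited papers: a dependent model in which two exponential potential survival times share a gamma-distributed "frailty," and an independent model chosen to reproduce the same crude survival probabilities. The parameter values and function names are hypothetical.

```python
import random

THETA = 2.0          # frailty variance (hypothetical value)
RATES = (1.0, 0.5)   # baseline hazards of two causes (hypothetical values)
TOTAL = sum(RATES)


def net_dependent(i, t):
    """Net survival P(Y_i > t) in the shared-frailty (dependent) model."""
    return (1.0 + THETA * RATES[i] * t) ** (-1.0 / THETA)


def net_independent(i, t):
    """Net survival in the independent model matched to the crude data."""
    return (1.0 + THETA * TOTAL * t) ** (-RATES[i] / (THETA * TOTAL))


def crude_exact(i, t):
    """Crude survival Q_i(t); the same closed form holds in both models."""
    return RATES[i] / TOTAL * (1.0 + THETA * TOTAL * t) ** (-1.0 / THETA)


def draw_dependent(rng):
    z = rng.gammavariate(1.0 / THETA, THETA)  # shared frailty, mean 1
    return [rng.expovariate(z * r) for r in RATES]


def draw_independent(rng):
    # Inverse-transform sampling from net_independent, cause by cause.
    out = []
    for r in RATES:
        u = 1.0 - rng.random()  # uniform on (0, 1]
        a = r / (THETA * TOTAL)
        out.append((u ** (-1.0 / a) - 1.0) / (THETA * TOTAL))
    return out


def empirical_crude(draw, i, t, n=50_000, seed=1):
    """Fraction of simulated animals with Y_i the minimum and Y_i > t."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y = draw(rng)
        if y[i] == min(y) and y[i] > t:
            hits += 1
    return hits / n
```

Calling `empirical_crude` with either sampler should give estimates close to `crude_exact`, while `net_dependent` and `net_independent` differ at every positive t. Data of the Table 1 type therefore cannot distinguish between the two systems of net survival probabilities.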

9. Survival Experiments with Serial Sacrifice. The "serial
sacrifice" methodology [Upton, 1969] represents a very important
advance in health-related experimentation. Rather than focus on the
diagnosed "causes" of death of the experimental animals, the serial
sacrifice experimentation deals with what I like to call "elementary
pathological states," say $S_1, S_2, \ldots$. For example, $S_1$ may
stand for thymic lymphoma (a cancer), $S_2$ for reticulum cell sarcoma,
another cancer, etc. At selected times, say $t_1, t_2, \ldots$, samples
of mice alive at these times are killed (= "sacrifice") and their
bodies are subjected to a pathological analysis. The result of such
analysis for a particular mouse may be that, at the time of its
sacrifice, it was affected by, say, three elementary pathological
states, $S_i$, $S_j$, $S_k$, and no others.

The above methodology provides empirical counterparts to the
following type of questions: how frequently are the mice alive at the
preassigned times $t_1, t_2, \ldots$ affected by this or that
combination of pathological states? Combined with similar data for
mice that died on their own (not through "sacrifice") the amount of
information from a serial sacrifice experiment is very

in Section 6, the non-identifiability of two contrasting mechanisms
of accident proneness was due to the insufficiency of observational
data: numbers of accidents incurred during a single year. The counts
of accidents incurred by each driver the following year made the
non-identifiability disappear. It is this analogy that is symbolized
by reference to the "Yule-Polya dilemma" in the title of the present
Chapter V.

I learned about the serial sacrifice design during a visit to the
Oak Ridge National Laboratory and, particularly, through conversations
with Dr. John B. Storer. At the time Dr. Storer was in charge of the
continuing experiment set up by Upton. Later, we had the pleasure of
Dr. Storer's visit to Berkeley. Also, we received from him a
substantial sample of data from the experiment in question. In these
data, the total number of elementary pathological states was eight.
A further difference from the "typical" survival experiment was that
there were no "causes" of death indicated.

While all human determinations are subject to error, the
determination of particular pathological states is comparable to
chemical analyses and represents an effort at objectivity. On the
other hand, the diagnosis of a "cause" of death is a conclusion likely
to be affected by subjective attitudes of the pathologists.

10. Another Shock of Non-Identifiability. As mentioned in
Section 6, the finding of non-identifiability affecting the study of
accident proneness caused Dr. Bates and myself to experience a shock.
Here, I have to admit a somewhat explosive feeling of enthusiasm I
felt when contemplating the experimental results obtainable through
a serial sacrifice experiment. I rather felt that these results,
without any additional observations, provide data for the study of a
stochastic process representing the natural succession of life and
death events: birth at time zero, followed by first illness at age
$t_1$, then by recovery at time $t_2$, etc., and finally death at some
observable time. Because the domain of stochastic processes is now
well developed, I expected that a statistical methodology could be
discovered to use the serial sacrifice data in order to estimate the
mechanism of treatment effects in mice contemplated, perhaps, as a
realization of a finite-state Markov chain, with all the transition
probabilities possible to estimate. Due to the work of Clifford
[1977], I experienced a shock. Even with some over-simplifying
assumptions (denying the possibility of "recovery," etc.) a discrete
time Markov chain model proved to be unidentifiable with respect to
the data of a serial sacrifice experiment! The details are described
in the analysis of Storer's data performed with Clifford's active
participation [Berlin et al., 1979].

While the serial sacrifice data provide answers to the questions
"how frequently mice sacrificed at age t are affected by a stated
combination of pathological states," the missing information relates
to mice alive at age t and having at that age a stated pathological
combination. During the subsequent unit of time, say during the next
100 days, the health state of these mice can change in many different
ways: recover from some illnesses, contract some others, etc. With the
present design of serial sacrifice experiments there is no
information on the frequency of such transitions. The tantalizing
question is whether some not too difficult modification of the
methodology could provide information to fill in the now existing
gaps. The way of discovering such effective modifications requires a
reasonably close cooperation between an intensely interested
statistician and an equally intensely interested experimenting
biologist. The questions to resolve are of the following type: could
the analysis of urine of a mouse provide enough information on its
health state? Could the analysis of a blood sample be sufficient?
However, can this sample of blood be taken without altering the
contemporary transition probabilities of the mouse, i.e., without
hurting the mouse? Who knows? However, unless one tries, one can
hardly hope to succeed.
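The nature of the missing information can be seen in a toy calculation. The sketch below is a deliberately simplified illustration, not Clifford's result: it assumes just two hypothetical health states and two consecutive sacrifice times, and exhibits two different sets of transition probabilities that lead to identical frequencies of states at both times.

```python
def next_marginal(pi, P):
    """State frequencies one step later: out[j] = sum_i pi[i] * P[i][j]."""
    n_states = len(P[0])
    return [sum(pi[i] * P[i][j] for i in range(len(pi)))
            for j in range(n_states)]


# States: 0 = "healthy", 1 = "ill" (hypothetical two-state simplification).
# Frequencies observed among mice sacrificed at the earlier time.
pi_t = [0.6, 0.4]

# Mechanism A: no recovery; a healthy mouse falls ill with probability 1/6.
P_no_recovery = [[5 / 6, 1 / 6],
                 [0.0, 1.0]]

# Mechanism B: frequent illness and equally frequent recovery.
P_with_recovery = [[0.5, 0.5],
                   [0.5, 0.5]]

marginal_a = next_marginal(pi_t, P_no_recovery)
marginal_b = next_marginal(pi_t, P_with_recovery)
```

Both mechanisms predict that half of the mice sacrificed at the later time are ill, so the frequencies of states among sacrificed mice cannot by themselves separate the two. Only observations on the same animal at two ages would do so.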


VI. EFFORT AT AN "OPTIMAL" COMPETITOR TO K.P.'S TEST FOR GOODNESS OF FIT


10. Introductory Remarks. This chapter is to illustrate my
preferred strategy of studying or of developing statistical tests:
begin by defining the optimal performance of the test, and then try
to deduce the desired criterion. As indicated in the title of the
chapter, the example chosen for illustration is Karl Pearson's test
"for goodness of fit" symbolized by $\chi^2$.

As is well known, the $\chi^2$ test is now being used for a variety
of purposes, such as contingency tables, etc. In these circumstances,
I wish to emphasize the limited scope of the following discussion: it
is concerned with the problem of "goodness of fit" as contemplated in
olden days by K.P. My actual effort to formulate the problem and to
solve it was published in [1937b]. It is limited to the case of a
"simple hypothesis," that is, to the case in which the problem is to
decide whether a completely specified probability density, say
$p_X(x)$, fits the empirical distribution of an observable random
variable X. Another limitation consists in the assumption that the
number N of observed values of X is "large." The problem of extending
the methodology to the case of composite parametric hypotheses has
been treated by Javitz [1975].

11. Criticism of K.P.'s $\chi^2$ Test for Goodness of Fit. An
effort at an "optimal" competitor of an existing test intended for
use in some specified conditions must begin by the unavoidably
subjective criticism of the original test. The well known procedure
of the $\chi^2$ test for goodness of fit begins by dividing the range
of variation of the observable X into a certain number, say s, of
"cells," with boundaries

$$a_0 < a_1 < a_2 < \cdots < a_s, \tag{5}$$

where $a_0$ may mean $-\infty$ and $a_s$ may be $+\infty$. Next, the
probability density $p_X(x)$ is used to compute the expected number of
independent observations, say $n_i$, falling into the i-th cell for
$i = 1, 2, \ldots, s$. Let $m_i$ denote the actual number out of the
total N observations that fall into the same i-th cell. Then K.P.'s
test criterion for goodness of fit is given by

$$\chi^2 = \sum_{i=1}^{s} \frac{(m_i - n_i)^2}{n_i}. \tag{6}$$

The fit is considered "bad" if the calculated $\chi^2$ exceeds the
tabled limit corresponding to the chosen level of significance.
Otherwise, the fit is considered "good."
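The computation of criterion (6) can be sketched in a few lines. The following is a minimal illustration, not code from the original work; the function names are hypothetical, and the convention that a cell includes its upper boundary is an assumption made only to keep the example definite.

```python
from bisect import bisect_left


def cell_counts(sample, interior_boundaries):
    """Count observations per cell.

    interior_boundaries holds a_1 < ... < a_{s-1}; a_0 and a_s are taken
    to be minus and plus infinity. Assumed convention: the i-th cell is
    the half-open interval (a_{i-1}, a_i].
    """
    counts = [0] * (len(interior_boundaries) + 1)
    for x in sample:
        counts[bisect_left(interior_boundaries, x)] += 1
    return counts


def pearson_chi_square(observed, expected):
    """Criterion (6): the sum over cells of (m_i - n_i)^2 / n_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((m - n) ** 2 / n for m, n in zip(observed, expected))
```

For instance, `pearson_chi_square([18, 22, 25, 15], [20, 20, 20, 20])` sums the four terms 4/20, 4/20, 25/20 and 25/20; the result is then compared with the tabled limit for the chosen level of significance.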

My own subjective criticism of the test includes the fact that the
value of the criterion (6) does not depend on the order of positive
and negative differences $(m_i - n_i)$. The extreme example is
represented by the following possibilities. In one case, the signs of
the consecutive differences $m_i - n_i$ and $m_{i+1} - n_{i+1}$ are not
the same. In the other case one can observe a substantial number of
consecutive differences $m_i - n_i$ that are all negative while all the
others are positive. While these two possibilities are consistent
with the same value of the criterion (6), my intuitive feeling is
that in the second case the "goodness of fit" is subject to a rather
strong doubt, irrespective of the actual computed value of (6), even
if it happens to be small.
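The two extreme possibilities can be put into numbers. The sketch below uses invented cell counts (eight cells with expected number 20 each) solely to show that criterion (6) takes the same value in both cases, while a simple count of sign runs, introduced here only as an illustrative device, separates them sharply.

```python
def pearson_chi_square(observed, expected):
    """Criterion (6): the sum over cells of (m_i - n_i)^2 / n_i."""
    return sum((m - n) ** 2 / n for m, n in zip(observed, expected))


def sign_runs(observed, expected):
    """Number of runs of consecutive differences m_i - n_i of equal sign.

    Cells with m_i == n_i are skipped.
    """
    signs = [1 if m > n else -1
             for m, n in zip(observed, expected) if m != n]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical data: eight cells, each with expected count 20.
expected = [20] * 8
alternating = [24, 16, 24, 16, 24, 16, 24, 16]  # signs change at every cell
clustered = [16, 16, 16, 16, 24, 24, 24, 24]    # all deficits come first

chi2_alternating = pearson_chi_square(alternating, expected)
chi2_clustered = pearson_chi_square(clustered, expected)
```

Both arrangements are permutations of the same differences, so the two criterion values coincide, whereas `sign_runs` gives eight runs in the first case and two in the second. The clustered pattern is the one that suggests a systematic departure from the hypothetical density.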


12. "Smooth Test" for Goodness of Fit. The first step in the
deduction of the "smooth test," intended as an "optimal" competitor
to $\chi^2$, consisted in standardizing the analytical developments.
Rather than consider the great variety of distributions $p_X(x)$ that
may come under consideration, I proposed to replace the observable X
by its function Y defined by the relation

$$y = \int_{-\infty}^{x} p_X(x)\,dx, \tag{7}$$

where y and x designate particular values of the two random
variables. As it is easy to check, the range of variation of Y is from
zero to unity, with its probability density

$$p_Y(y) = 1, \tag{8}$$

this irrespective of the distribution of X.
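Relations (7) and (8) are easy to check numerically. The sketch below assumes, purely for illustration, that the hypothetical density is exponential with a known rate; the function names are hypothetical as well. When the hypothesis is true, the transformed values should behave like a sample from the uniform distribution on (0, 1), with mean 1/2 and variance 1/12.

```python
import math
import random


def exponential_cdf(x, rate):
    """Integral (7) for an exponential density with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def probability_integral_transform(sample, cdf):
    """Replace each observed x by y = cdf(x), as in relation (7)."""
    return [cdf(x) for x in sample]


def mean_and_variance(values):
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


rng = random.Random(0)
xs = [rng.expovariate(2.0) for _ in range(20_000)]  # hypothesis is true here
ys = probability_integral_transform(xs, lambda x: exponential_cdf(x, 2.0))
y_mean, y_var = mean_and_variance(ys)
```

If the sample were transformed with a wrong rate, the values `ys` would no longer be uniform, and it is departures of this kind that a test applied to Y is meant to detect.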

As contemplated by Karl Pearson, the background of the problem of
goodness of fit admits the possibility that the specified density
$p_X(x)$ may not correspond to reality. However, there are no general
indications as to what the alternatives might be. In my attempt to
deduce an optimal competitor to the chi square test, I contemplated
the set of alternatives vaguely described as "smooth."

In terms of the variable Y, with its range of variation limited to
the interval (0, 1) where its density is equal to unity, the
contemplated "smooth" alternatives are those with densities the
logarithms of which are polynomials of orders 1, ..., K.

The theory published in 1937 develops an asymptotic version of
optimal unbiased type C tests of orders K = 1, 2, ..., with K denoting
the order of polynomial used. The study of asymptotic power of these
tests indicates that, generally, adequate results could be obtained
with K not exceeding 4. The tests so deduced are not open to the
criticism of the original test for goodness of fit indicated above.
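The text does not reproduce the test criterion itself. As a hedged illustration, the sketch below implements the form in which the smooth test is usually presented in textbooks, which is an assumption here and not a transcription from the 1937 paper: the statistic is the sum of squares of normalized sums of orthonormal Legendre polynomials on (0, 1), referred under the hypothesis tested to the chi square distribution with K degrees of freedom. The function names are hypothetical.

```python
import math


def legendre_orthonormal(j, y):
    """Orthonormal shifted Legendre polynomial of order j on (0, 1)."""
    if j == 1:
        return math.sqrt(3.0) * (2.0 * y - 1.0)
    if j == 2:
        return math.sqrt(5.0) * (6.0 * y**2 - 6.0 * y + 1.0)
    if j == 3:
        return math.sqrt(7.0) * (20.0 * y**3 - 30.0 * y**2 + 12.0 * y - 1.0)
    if j == 4:
        return 3.0 * (70.0 * y**4 - 140.0 * y**3 + 90.0 * y**2
                      - 20.0 * y + 1.0)
    raise ValueError("only orders 1 to 4 are implemented")


def smooth_test_statistic(ys, order=4):
    """Sum over j = 1..order of (N^(-1/2) * sum_i pi_j(y_i))^2.

    ys are the observations already transformed to (0, 1) by relation (7).
    """
    n = len(ys)
    total = 0.0
    for j in range(1, order + 1):
        u_j = sum(legendre_orthonormal(j, y) for y in ys) / math.sqrt(n)
        total += u_j ** 2
    return total
```

Unlike criterion (6), this statistic depends on where in the interval (0, 1) the excesses and deficits of observations occur: a surplus of small values of Y and a deficit of large ones, for example, inflates the first-order component.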

In recent times quite a few non-parametric tests for goodness of
fit have been considered with emphasis on their robustness. It would
be interesting to use the Monte Carlo methodology to compare the
performance of these tests with that of the smooth test of a limited
order K = 4.